2. the frequency selectivity of the auditory system: our ability to perceive two similar frequencies as
distinct;
3. the modeling of the auditory system as a bank of auditory filters;
4. the perception of loudness;
5. the perception of rhythm.
6.2) Perception of frequencies
The auditory system, like the visual system, is able to detect frequencies over a wide range of scales. In
order to measure frequencies over a very large range, it operates using a logarithmic scale. Let us
consider a pure tone, modeled by a sinusoidal signal oscillating at a frequency f. If f < 500 Hz, the
perceived tone (or pitch) varies as a linear function of f. When f > 1,000 Hz, the perceived pitch
increases logarithmically with f. Several frequency scales have been proposed to capture the logarithmic
scaling of frequency perception.
6.3) The mel/Bark scale
The Bark scale (named after the German physicist Barkhausen) is defined as

$$z = 13 \arctan(0.00076 f) + 3.5 \arctan\left((f/7500)^2\right),$$

where f is measured in Hz. The mel-scale is defined by the fact that 1 bark = 100 mel. In this lab we will
use a slightly modified version of the mel scale defined by

$$m = 1127.01048 \log\left(1 + \frac{f}{700}\right).$$

Note that, by construction, the m value corresponding to 1000 Hz is 1000 (the logarithm in the above
formula is the natural log).
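
The mapping between Hz and mel can be sketched in a few lines of Python. This is only an illustration: it assumes the widely used form m = 1127.01048 ln(1 + f/700), which satisfies the property stated above (1000 Hz maps to 1000 when the natural log is used). The function names `hz_to_mel` and `mel_to_hz` and the two constants are hypothetical names chosen for this sketch.

```python
import math

# Assumed constants of the common mel formula (not taken verbatim from this lab).
MEL_SCALE = 1127.01048   # chosen so that 1000 Hz maps to 1000 mel
MEL_CORNER_HZ = 700.0    # frequency at which the curve bends from linear to logarithmic


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the mel scale, using the natural logarithm."""
    return MEL_SCALE * math.log(1.0 + f_hz / MEL_CORNER_HZ)


def mel_to_hz(m):
    """Inverse mapping: convert a mel value back to a frequency in Hz."""
    return MEL_CORNER_HZ * (math.exp(m / MEL_SCALE) - 1.0)
```

Calling `hz_to_mel(1000.0)` gives a value very close to 1000, and `mel_to_hz(hz_to_mel(f))` recovers f up to rounding error. The inverse is useful when filter center frequencies are spaced uniformly in mel and must be converted back to Hz.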
6.4) The cochlear filterbank
Finally, we need to account for the fact that the auditory system behaves as a set of filters, i.e. as a
filterbank, whose filters have overlapping frequency responses. For each filter, the range of frequencies
over which the filter response is significant is called a critical band. A critical band is a band of audio
frequencies within which a second tone will interfere with the first via auditory masking.
Our perception of pitch can be quantified using the total energy at the output of each filter. All spectral
energy that falls into one critical band is summed up, leading to a single number for that band.
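
As a rough illustration of summing spectral energy per band, the sketch below accumulates the power of every spectral bin that falls inside a band. It makes a simplifying assumption that the text does not: the bands are rectangular and do not overlap, whereas the auditory filters have overlapping responses. The function name `band_energies` and its parameters are hypothetical.

```python
def band_energies(power_spectrum, freqs_hz, band_edges_hz):
    """Sum spectral power inside each band [edge_b, edge_{b+1}).

    power_spectrum : power of each spectral bin
    freqs_hz       : center frequency of each bin, in Hz (same length)
    band_edges_hz  : increasing band edges in Hz; B + 1 edges define B bands
    Assumes rectangular, non-overlapping bands (a simplification).
    """
    energies = [0.0] * (len(band_edges_hz) - 1)
    for power, f in zip(power_spectrum, freqs_hz):
        for b in range(len(energies)):
            if band_edges_hz[b] <= f < band_edges_hz[b + 1]:
                energies[b] += power  # one number per band
                break
    return energies
```

For example, `band_energies([1, 2, 3, 4], [50, 150, 250, 350], [0, 200, 400])` returns `[3.0, 7.0]`: the first two bins land in the band from 0 to 200 Hz and the last two in the band from 200 to 400 Hz. Bins outside all bands are ignored.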
We describe in the following a simple model of the cochlear filterbank. The filter bank is constructed
using N_B = 40 logarithmically spaced triangle filters centered at the frequencies Ω_p, which are implicitly
defined by